AITopics | pure exploration

Multi-task Representation Learning for Pure Exploration in Bilinear Bandits

Neural Information Processing Systems | Feb-15-2026, 23:38:47 GMT

Bilinear bandits (Jun et al., 2019; Lu et al., 2021; Kang et al., 2022) are an important class of

artificial intelligence, data mining, machine learning, (16 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.94)

Sequential Experimental Design for Transductive Linear Bandits

Tanner Fiez, Lalit Jain, Kevin G. Jamieson, Lillian Ratliff

Neural Information Processing Systems | Feb-12-2026, 21:14:22 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, allocation, sample complexity, (15 more...)

Country:

North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > Canada (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.48)

60cb558c40e4f18479664069d9642d5a-Paper.pdf

Neural Information Processing Systems | Feb-12-2026, 09:01:34 GMT

We determine the sample complexity of pure exploration bandit problems with multiple good answers.

artificial intelligence, big data, data mining, (18 more...)

Country:

North America > United States (0.05)
Oceania > Australia > New South Wales > Sydney (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.05)
(4 more...)

Technology:

Information Technology > Artificial Intelligence (0.94)
Information Technology > Data Science > Data Mining > Big Data (0.35)

Asymptotically Optimal Quantile Pure Exploration for Infinite-Armed Bandits

Neural Information Processing Systems | Feb-10-2026, 13:59:54 GMT

We study pure exploration with infinitely many bandit arms generated i.i.d.

artificial intelligence, data mining, machine learning, (19 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre:

Research Report (0.67)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

3ae8a7d6fc6d0d45e7c1ad9d4b063a01-Paper-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 13:14:35 GMT

algorithm, proc, sample complexity, (14 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Taiwan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Multi-task Representation Learning for Pure Exploration in Bilinear Bandits

Neural Information Processing Systems | Dec-26-2025, 09:09:36 GMT

We study multi-task representation learning for the problem of pure exploration in bilinear bandits. In bilinear bandits, an action takes the form of a pair of arms from two different entity types and the reward is a bilinear function of the known feature vectors of the arms. In the multi-task bilinear bandit problem, we aim to find optimal actions for multiple tasks that share a common low-dimensional linear representation. The objective is to leverage this characteristic to expedite the process of identifying the best pair of arms for all tasks. We propose the algorithm GOBLIN that uses an experimental design approach to optimize sample allocations for learning the global representation as well as minimize the number of samples needed to identify the optimal pair of arms in individual tasks. To the best of our knowledge, this is the first study to give sample complexity analysis for pure exploration in bilinear bandits with shared representation. Our results demonstrate that by learning the shared representation across tasks, we achieve significantly improved sample complexity compared to the traditional approach of solving tasks independently.
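
To make the reward model in this abstract concrete, here is a minimal Python sketch of a single bilinear task. It assumes the parameter matrix `theta` is known, so the best pair of arms can be found by brute force. This is not the GOBLIN algorithm: in the bandit setting the matrix is unknown and has to be learned from noisy samples. The names `bilinear_reward` and `best_pair`, the rank-1 matrix, and the toy arms are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def bilinear_reward(x, theta, z):
    """Return x^T Theta z for feature vectors x, z and a matrix Theta (list of rows)."""
    return sum(
        x[i] * theta[i][j] * z[j]
        for i in range(len(x))
        for j in range(len(z))
    )


def best_pair(left_arms, right_arms, theta):
    """Brute-force the (left index, right index) pair with the largest expected reward."""
    pairs = product(range(len(left_arms)), range(len(right_arms)))
    return max(
        pairs,
        key=lambda p: bilinear_reward(left_arms[p[0]], theta, right_arms[p[1]]),
    )


# Hypothetical toy instance: a rank-1 matrix Theta = u v^T, standing in for the
# low-dimensional structure mentioned in the abstract.
u = [1.0, 0.0]
v = [0.0, 1.0]
theta = [[ui * vj for vj in v] for ui in u]

# Two arms per entity type, each described by a known 2-dimensional feature vector.
left_arms = [[1.0, 0.0], [0.0, 1.0]]
right_arms = [[1.0, 0.0], [0.0, 1.0]]

best = best_pair(left_arms, right_arms, theta)
print(best)
```

In this toy instance only the pairing of the first left arm with the second right arm has non-zero expected reward, so that pair is returned. A pure exploration algorithm would have to reach the same answer from noisy observations with as few samples as possible.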

multi-task representation learning, name change, pure exploration, (2 more...)

Genre: Research Report > New Finding (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.77)
Information Technology > Data Science > Data Mining (0.60)

Pure Exploration in Kernel and Neural Bandits

Neural Information Processing Systems | Dec-24-2025, 04:56:55 GMT

We study pure exploration in bandits, where the dimension of the feature representation can be much larger than the number of arms. To overcome the curse of dimensionality, we propose to adaptively embed the feature representation of each arm into a lower-dimensional space and carefully deal with the induced model misspecifications. Our approach is conceptually very different from existing works that can either only handle low-dimensional linear bandits or passively deal with model misspecifications. We showcase the application of our approach to two pure exploration settings that were previously under-studied: (1) the reward function belongs to a possibly infinite-dimensional Reproducing Kernel Hilbert Space, and (2) the reward function is nonlinear and can be approximated by neural networks. Our main results provide sample complexity guarantees that only depend on the effective dimension of the feature spaces in the kernel or neural representations. Extensive experiments conducted on both synthetic and real-world datasets demonstrate the efficacy of our methods.

kernel and neural bandit, name change, pure exploration, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Challenging Common Assumptions in Convex Reinforcement Learning

Neural Information Processing Systems | Dec-23-2025, 21:01:37 GMT

The classic Reinforcement Learning (RL) formulation concerns the maximization of a scalar reward function. More recently, convex RL has been introduced to extend the RL formulation to all the objectives that are convex functions of the state distribution induced by a policy. Notably, convex RL covers several relevant applications that do not fall into the scalar formulation, including imitation learning, risk-averse RL, and pure exploration. In classic RL, it is common to optimize an infinite trials objective, which accounts for the state distribution instead of the empirical state visitation frequencies, even though the actual number of trajectories is always finite in practice. This is theoretically sound since the infinite trials and finite trials objectives are equivalent and thus lead to the same optimal policy. In this paper, we show that this hidden assumption does not hold in convex RL. In particular, we prove that erroneously optimizing the infinite trials objective in place of the actual finite trials one, as it is usually done, can lead to a significant approximation error. Since the finite trials setting is the default in both simulated and real-world RL, we believe shedding light on this issue will lead to better approaches and methodologies for convex RL, impacting relevant research areas such as imitation learning, risk-averse RL, and pure exploration among others.
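
The gap between the two objectives can be seen in a very small example. The following Python sketch assumes a hypothetical one-step environment in which a policy reaches one of two states, the first with probability `p`, and it uses state entropy as the objective, as in pure exploration (maximizing entropy is the same as minimizing its convex negative). The infinite trials objective evaluates the entropy of the expected state distribution, while the finite trials objective averages the entropy of the empirical distribution over `n` sampled trajectories. The environment and the function names are assumptions made for illustration and are not the paper's construction.

```python
from math import comb, log


def entropy(q):
    """Entropy (in nats) of a two-state distribution (q, 1 - q)."""
    return -sum(p * log(p) for p in (q, 1 - q) if p > 0)


def infinite_trials_objective(p):
    """Entropy of the expected state distribution induced by the policy."""
    return entropy(p)


def finite_trials_objective(p, n):
    """Expected entropy of the empirical state distribution over n trajectories.

    The number of visits to the first state is Binomial(n, p), so the
    expectation is computed exactly by summing over all outcomes.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) * entropy(k / n)
        for k in range(n + 1)
    )


p = 0.5
for n in (1, 2, 10, 100):
    print(n, finite_trials_objective(p, n), infinite_trials_objective(p))
```

With a single trajectory the empirical distribution is always a point mass, so the finite trials value is 0 for every `p`, whereas the infinite trials value at `p = 0.5` is ln 2. The two values move closer together as `n` grows, which is consistent with the abstract's point that the mismatch matters when the number of trajectories is small.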

convex reinforcement learning, name change, reinforcement learning, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
